An Analysis of a Combined Hardware-Software Mechanism for Speculative Loads

Authors

  • Stefanos Damianakis
  • Kai Li
Abstract

This paper describes a simple hardware mechanism and related compiler support for software-controlled speculative loads. The compiler issues speculative load instructions based on anticipated data references and the ability of the memory system to hide memory latency in high-performance processors. The architectural support for such a mechanism is simple and minimal, yet handles faults gracefully. We have simulated three speculative load mechanisms based on a MIPS processor and a detailed memory system. The results of scientific kernel loops indicate that speculative load techniques can hide memory latency effectively.

Similar resources

Architecture-Compatible Code Boosting for Performance Enhancement of the IBM RS/6000

are four main areas in which we see opportunities for future work. The first is in measuring the effect of hardware extensions to our current machine model for supporting unsafe code boosting. The second is implementing a software mechanism similar to those proposed by Bernstein et al. [10] for proving the safety of speculative loads and measuring its impact on performance. The third is augment...

Performance Evaluation of Configurable Hardware Features on the AMD-K5

Many modern processors incorporate certain configurable hardware features, although these features are never publicized. For instance, the AMD-K5 incorporates the ability to disable branch prediction, put caches into write-allocate mode, etc. The ability to configure the features by software, combined with the availability of on-chip performance counters, allows the direct measurement of the perfo...

Exploring Thread-Level Speculation in Software: The Effects of Memory Access Tracking Granularity

Speculative execution is often the only way to overcome dataflow-imposed limitations and exploit parallelism when dependences can be discovered only at run-time. It also facilitates automatic parallelization of programs that exhibit complicated memory access patterns, which make complete compile-time dependence analysis either impossible or extremely complicated. A number of approaches for coar...

A Chip-Multiprocessor Architecture with Speculative Multithreading

Much emphasis is now placed on chip-multiprocessor (CMP) architectures for exploiting thread-level parallelism in an application. In such architectures, speculation may be employed to execute applications that cannot be parallelized statically. In this paper, we present an efficient CMP architecture for speculative execution of sequential binaries without source recompilation. We present the s...

Franklin and Sohi: ARB - a Hardware Mechanism for Dynamic Reordering of Memory

To exploit instruction-level parallelism, it is important not only to execute multiple memory references per cycle, but also to reorder memory references, especially to execute loads before stores that precede them in the sequential instruction stream. To guarantee correctness of execution in such situations, memory reference addresses have to be disambiguated. This paper presents a novel hardwa...

Publication date: 1994